Semantic Metrics, Conceptual Metrics, and Ontology Metrics: An Analysis of Software Quality Using IR-based Systems, Potential Applications and Collaborations
Abstract
Similarities and differences between “semantic metrics” (metrics defined on a knowledge-based IR system) and “conceptual metrics” (metrics defined on a Latent Semantic Indexing-based IR system) are discussed. Potential collaboration areas between research groups are identified. Potential application and collaboration areas of a new research area called “ontology metrics,” metrics calculated on the ontologies that form part of an ontology-based software system, are also discussed. Currently, ontology metrics are calculated using techniques similar to semantic metrics, but other semantically-based expansions, including some similar to conceptual metrics, are possible.

Application areas of Information Retrieval (IR)-based approaches have in the past included concept location [13], clone detection [8], traceability link recovery [11], and software reuse [5,6], among others. One relatively new application area is the following: in the past few years, the use of IR-based techniques has been extended to examine software quality, through metrics based on the output of IR-based systems. This new field, introduced by Etzkorn et al. [7,8] and furthered by Stein et al. [16,17,18] and Cox et al. [3], and called “semantic metrics,” has recently been extended by Marcus and Poshyvanyk [12,20], and by them called “conceptual metrics.” Conceptual metrics differ from semantic metrics in that semantic metrics are defined in the context of comment and identifier analysis within a program understanding system based on knowledge-based natural language understanding, whereas conceptual metrics are defined in the context of comment and identifier analysis within a program understanding system based on Latent Semantic Indexing. Latent Semantic Indexing (LSI) is a corpus-based statistical method for representing the meaning of natural language words and passages (sentences, paragraphs, etc.). With LSI, meanings are derived from usage rather than from a predefined dictionary [11].

Both types of IR-based metrics, semantic metrics and conceptual metrics, have the advantage that they can be used to measure some aspects of software design, such as cohesion, which are difficult to measure [7] using traditional syntactic metrics [2] based solely on counting items in the source code. Both types of IR-based metrics could also be used to analyze design qualities in, for example, software design documents [18] as well as in source code, which would expand metrics capability far beyond that provided by most traditional syntactic metrics.

The relative advantages and disadvantages of semantic metrics and conceptual metrics would seem to stem primarily from the differences between the underlying natural language-based program understanding systems: knowledge-based or LSI-based. For example, one primary disadvantage of a knowledge-based natural language understanding system is the necessity to create knowledge bases for each new domain examined (or possibly one large knowledge base to cover several domains). An additional, related disadvantage of the knowledge-based system approach would be a potential lack of scalability due to the difficulties involved in word sense disambiguation in large knowledge bases. Therefore, the LSI approach is possibly more easily scalable than the knowledge-based system approach. (We note here, however, that there are a number of variations of Word Sense Disambiguation (WSD) techniques used for IR that do not depend on complete knowledge bases. Ozcan and Aslandogan [15] performed a study that employed various WSD and LSI techniques separately, with success.)
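To make the LSI discussion above concrete, the following sketch (Python with numpy; the comment and identifier text is invented for illustration and is not drawn from any of the systems cited in this paper) shows the core LSI computation: a term-document matrix is built from method comments and identifiers, reduced via truncated singular value decomposition, and the resulting document vectors are compared by cosine similarity in the reduced space.

```python
import numpy as np

# Toy "documents": comment and identifier text extracted from three methods
# (hypothetical examples, not drawn from the cited studies).
docs = [
    "open socket connection send packet to remote host",
    "read incoming packet from socket connection buffer",
    "compute average grade for enrolled student records",
]

# Build a simple term-document count matrix (rows = terms, columns = documents).
vocab = sorted({term for doc in docs for term in doc.split()})
A = np.array([[doc.split().count(term) for doc in docs] for term in vocab],
             dtype=float)

# LSI: truncated singular value decomposition of the term-document matrix.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                         # latent dimensions retained
doc_vectors = (np.diag(s[:k]) @ Vt[:k, :]).T  # each row: one document in LSI space

def cosine(a, b):
    """Cosine similarity between two LSI-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Methods 0 and 1 share socket/packet vocabulary; method 2 does not.
print(cosine(doc_vectors[0], doc_vectors[1]))  # relatively high
print(cosine(doc_vectors[0], doc_vectors[2]))  # relatively low
```

Because the similarities are derived from co-occurrence patterns in the corpus rather than from a hand-built knowledge base, the same pipeline can be rerun on a new domain without constructing new domain knowledge, which is the scalability advantage noted above.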
However, LSI can be criticized for not making use of word order, morphology, or the syntactic relations between words [13]; although it has been shown overall to work well [13], it could potentially work better on some input data than on others. For example, Deerwester et al. [4] found a substantial improvement over traditional term (keyword-style) indexing using latent semantic indexing on one of the two corpora they tested (a 13% improvement on the MED corpus), while they found no improvement on the other corpus (the CISI corpus). They suggested that the poor performance on the CISI corpus was potentially due to the homogeneous nature of the dataset.

However, in areas where both semantic metrics and conceptual metrics can be collected easily (data areas where both knowledge-based program understanding and LSI-based program understanding will operate readily; for example, where knowledge bases are available and the corpus involved is nonhomogeneous), a comparison between semantic metrics and conceptual metrics intended to measure the same software qualities would be interesting. Each type of metric has, in the past, been used to measure the same quality. For example, the software quality “cohesion” has been measured using semantic metrics [3,17] and conceptual metrics [12]. In these papers, the semantic and conceptual metrics have been compared to existing syntactic metrics, but not (to the author of this paper’s knowledge) to each other.
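The following sketch illustrates, in deliberately simplified form, how an LSI-style representation can yield a cohesion-like class measurement: the average pairwise cosine similarity of a class’s methods in the reduced space. This is only a rough proxy in the spirit of the conceptual cohesion metrics cited above, not a reproduction of any published metric definition; the class and its method texts are invented, and real tools would work with full identifier splitting and much larger corpora.

```python
from itertools import combinations
import numpy as np

def lsi_vectors(texts, k=2):
    """Map each text (comments + identifiers of one method) to a k-dimensional
    LSI vector via truncated SVD of the term-document matrix."""
    vocab = sorted({t for text in texts for t in text.split()})
    A = np.array([[text.split().count(t) for text in texts] for t in vocab],
                 dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    return (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def conceptual_cohesion(method_texts, k=2):
    """Average pairwise cosine similarity of a class's methods in LSI space --
    a rough cohesion proxy, not a published conceptual metric definition."""
    vecs = lsi_vectors(method_texts, k)
    pairs = list(combinations(range(len(method_texts)), 2))
    return sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)

# Hypothetical class whose methods all address a single concept (a bank account).
account_methods = [
    "deposit amount into account balance",
    "withdraw amount from account balance",
    "return current account balance",
]
print(conceptual_cohesion(account_methods))  # closer to 1.0 => more cohesive
```

A semantic-metric counterpart would compute an analogous score from the concepts and relationships that a knowledge-based program understanding system assigns to the same methods, which is what would make a side-by-side comparison of the two metric families on common data interesting.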
One interesting expansion of both kinds of metrics would be to investigate applying variations of already-proposed semantic or conceptual metrics to comment and identifier understanding that employs other variations of IR, preferably industrial-strength IR implementations, or possibly combinations of existing IR approaches. Furthering this idea would be an examination of how previously defined semantic metrics and conceptual metrics would be defined differently when based on different kinds of IR systems. For example, semantic metrics are defined in the context of a knowledge base loosely based on the conceptual graph representation format [7]. How would the definition of the semantic metrics change when defined on a different knowledge representation format, for example, a knowledge base defined in the Resource Description Framework (RDF) knowledge representation format? Similar questions could potentially be raised for conceptual metrics if applied to different kinds of IR systems.

This question leads into another research area, that of metrics to measure ontology-based systems, otherwise known as ontology metrics, or ontology-based metrics [14,19]. In ontology-based systems, ontologies provide a common, shared interface definition between applications. One example application area is internet-based systems providing Business-to-Business or Business-to-Consumer services. These types of service-oriented architectures, as well as other web-based systems, often employ ontology data in the description of web services. Such ontology data is often represented in the Resource Description Framework (RDF) and stored as XML [14]. The same kinds of software processes are used to develop ontology-based systems as are used for more traditional software systems that do not employ ontologies. However, an integral part of developing an ontology-based system is the development of an ontology. Therefore, an important part of measuring the quality of an ontology-based system is measuring the quality of the ontologies themselves.

Recently, Orme, Yao, and Etzkorn [14] and Yao, Orme, and Etzkorn [19] examined ontology cohesion and coupling, using techniques similar to the calculation of the earlier semantic metrics within the knowledge base of the program understanding system; in this case, however, the calculations were performed directly on the ontologies themselves, which were stored in RDF format. The intent of this work was to examine ontology cohesion and coupling as they might affect a software system that employed those ontologies. Other work, by Burton-Jones et al. [1], examined the quality of individual ontologies based on a semiotic framework derived from linguistics theory, which includes general elements of quality based on linguistic-style concepts related to the analysis of signs and symbols. However, the work of Burton-Jones et al. was not directly targeted toward the use of these ontologies in a software system.

In addition to web services employed for internet-based systems, bioinformatics and genomics researchers also make heavy use of ontologies. Orme, Yao, and Etzkorn [14] performed a bioinformatics case study in which they examined the integration of existing ontologies to reduce failures that could occur in the runtime use of these ontologies. Substantial work remains in ontology metrics, particularly validating the metrics in various application areas, but also determining how these metrics can best be employed in software maintenance on the new kinds of software applications that make heavy use of ontologies.

As mentioned earlier, the ontology metrics of Orme, Yao, and Etzkorn [14] and Yao, Orme, and Etzkorn [19] were calculated similarly to how semantic metrics are calculated within the knowledge base of a program understanding system. However, other, more advanced semantic techniques are possible (in the case of these ontology metrics as currently defined, an IR-based program understanding system is not employed; rather, the metrics are calculated based on the relationships specified in the ontologies themselves). Recently, Kotis, Vouros, and Stergiou [9] examined the automated mapping/merging of ontologies using the meaning of concepts, by mapping them to WordNet senses using LSI. This kind of approach could lead to conceptual metrics that measure ontology qualities. Similarly, combinations of the original ontology metrics of Orme, Yao, and Etzkorn [14] and new conceptual metrics for ontologies could potentially be useful.
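As a minimal illustration of computing structural counts directly on an ontology, the sketch below loads a small invented ontology with the rdflib Python library and derives simple class and relationship counts. These counts are only rough stand-ins in the spirit of relationship-based ontology metrics, not the published definitions of [14,19]; Turtle syntax is used for brevity, although rdflib parses RDF/XML in the same way (format="xml").

```python
from rdflib import Graph
from rdflib.namespace import RDF, RDFS, OWL

# A tiny invented ontology (not taken from the cited studies).
ONTOLOGY = """
@prefix ex:   <http://example.org/shop#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Product   a owl:Class .
ex:Book      a owl:Class ; rdfs:subClassOf ex:Product .
ex:EBook     a owl:Class ; rdfs:subClassOf ex:Book .
ex:Customer  a owl:Class .
ex:purchases a owl:ObjectProperty ; rdfs:domain ex:Customer ; rdfs:range ex:Product .
"""

g = Graph()
g.parse(data=ONTOLOGY, format="turtle")

# Simple structural counts computed directly on the ontology graph --
# illustrative relationship-style counts, not the metrics of [14,19].
classes = set(g.subjects(RDF.type, OWL.Class))
subclass_links = list(g.triples((None, RDFS.subClassOf, None)))
object_properties = set(g.subjects(RDF.type, OWL.ObjectProperty))

num_classes = len(classes)
num_relationships = len(subclass_links) + len(object_properties)

print("classes:", num_classes)
print("inheritance links:", len(subclass_links))
# Average relationships per class: a crude connectedness/cohesion-style figure.
print("relationships per class:", num_relationships / num_classes)
```

Because such counts come straight from the RDF graph, no IR-based program understanding step is required; an LSI-based extension of the kind suggested by the Kotis, Vouros, and Stergiou work would instead compare the natural language labels of the ontology concepts in a reduced semantic space.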